Search | BVS (Virtual Health Library) Regional Portal

1.

Mapping of Alzheimer's disease related data elements and the NIH Common Data Elements.

Hao, Xubing; Abeysinghe, Rashmie; Zheng, Fengbo; Schulz, Paul E; Cui, Licong.

BMC Med Inform Decis Mak ; 24(Suppl 3): 103, 2024 Apr 19.

Article in English | MEDLINE | ID: mdl-38641585

ABSTRACT

BACKGROUND: Alzheimer's Disease (AD) is a devastating disease that destroys memory and other cognitive functions. There has been an increasing research effort to prevent and treat AD. In the US, two major data sharing resources for AD research are the National Alzheimer's Coordinating Center (NACC) and the Alzheimer's Disease Neuroimaging Initiative (ADNI). Additionally, the National Institutes of Health (NIH) Common Data Elements (CDE) Repository has been developed to facilitate data sharing and improve the interoperability among data sets in various disease research areas. METHOD: To better understand how AD-related data elements in these resources are interoperable with each other, we leverage different representation models to map data elements from different resources: NACC to ADNI, NACC to NIH CDE, and ADNI to NIH CDE. We explore bag-of-words based and word embeddings based models (Word2Vec and BioWordVec) to perform the data element mappings in these resources. RESULTS: The data dictionaries downloaded on November 23, 2021 contain 1,195 data elements in NACC, 13,918 in ADNI, and 27,213 in NIH CDE Repository. Data element preprocessing reduced the numbers of NACC and ADNI data elements for mapping to 1,099 and 7,584 respectively. Manual evaluation of the mapping results showed that the bag-of-words based approach achieved the best precision, while the BioWordVec based approach attained the best recall. In total, the three approaches mapped 175 out of 1,099 (15.92%) NACC data elements to ADNI; 107 out of 1,099 (9.74%) NACC data elements to NIH CDE; and 171 out of 7,584 (2.25%) ADNI data elements to NIH CDE. CONCLUSIONS: The bag-of-words based and word embeddings based approaches showed promise in mapping AD-related data elements between different resources. Although the mapping approaches need further improvement, our results indicate that there is a critical need to standardize CDEs across these valuable AD research resources in order to maximize the discoveries regarding AD pathophysiology, diagnosis, and treatment that can be gleaned from them.
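
The bag-of-words mapping idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure or cutoff, so Jaccard similarity, the 0.6 threshold, and the element names and labels below are all assumptions chosen for illustration.

```python
import re


def tokens(label: str) -> frozenset:
    """Lowercase a data element label and return its set of word tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", label.lower()))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two token sets (assumed measure, not from the paper)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def map_elements(source: dict, target: dict, threshold: float = 0.6) -> dict:
    """Map each source element to its most similar target element.

    A mapping is kept only when the similarity reaches the (hypothetical) threshold.
    """
    target_tokens = {name: tokens(label) for name, label in target.items()}
    mappings = {}
    for s_name, s_label in source.items():
        s_tok = tokens(s_label)
        best = max(
            target_tokens,
            key=lambda t: jaccard(s_tok, target_tokens[t]),
            default=None,
        )
        if best is not None and jaccard(s_tok, target_tokens[best]) >= threshold:
            mappings[s_name] = best
    return mappings


# Hypothetical element names and labels, invented for this example.
source = {
    "SRC_AGE": "Subject age at visit",
    "SRC_SMOKE": "Smoked cigarettes in last 30 days",
}
target = {
    "TGT_AGE": "Age of subject at visit",
    "TGT_EDU": "Years of education",
}
print(map_elements(source, target))
```

A word-embedding variant would replace the token sets with averaged word vectors and Jaccard with cosine similarity, which is one plausible reason such models trade precision for recall.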

Subjects

Alzheimer Disease , United States/epidemiology , Humans , Alzheimer Disease/diagnostic imaging , Alzheimer Disease/epidemiology , Common Data Elements , Neuroimaging , National Institutes of Health (U.S.)

2.

Knowledge Representation and Management 2022: Findings in Ontology Development and Applications.

Charlet, Jean; Cui, Licong.

Yearb Med Inform ; 32(1): 225-229, 2023 Aug.

Article in English | MEDLINE | ID: mdl-38147864

ABSTRACT

OBJECTIVES: To select, present, and summarize the best papers in 2022 for the Knowledge Representation and Management (KRM) section of the International Medical Informatics Association (IMIA) Yearbook. METHODS: We conducted PubMed queries and followed the IMIA Yearbook guidelines for performing biomedical informatics literature review to select the best papers in KRM published in 2022. RESULTS: We retrieved 1,847 publications from PubMed. We nominated 15 candidate best papers, and two of them were finally selected as the best papers in the KRM section. The topics covered by the candidate papers include ontology and knowledge graph creation, ontology applications, ontology quality assurance, ontology mapping standard, and conceptual model. CONCLUSIONS: In the KRM best paper selection for 2022, the candidate best papers encompassed a broad range of topics, with ontology and knowledge graph creation remaining a considerable research focus.

Subjects

Medical Informatics , Knowledge Management

3.

An ontology-based approach for harmonization and cross-cohort query of Alzheimer's disease data resources.

Hao, Xubing; Li, Xiaojin; Zhang, Guo-Qiang; Tao, Cui; Schulz, Paul E; Cui, Licong.

BMC Med Inform Decis Mak ; 23(Suppl 1): 151, 2023 08 04.

Article in English | MEDLINE | ID: mdl-37542312

ABSTRACT

BACKGROUND: In the United States, the National Alzheimer's Coordinating Center (NACC) and the Alzheimer's Disease Neuroimaging Initiative (ADNI) are two major data sharing resources for Alzheimer's Disease (AD) research. NACC and ADNI strive to make their data more FAIR (findable, accessible, interoperable, and reusable) for the broader research community. However, there is limited work harmonizing and supporting cross-cohort interoperability of the two resources. METHOD: In this paper, we leverage an ontology-based approach to harmonize data elements in the two resources and develop a web-based query system to search patient cohorts across the two resources. We first mapped data elements across NACC and ADNI, and performed value harmonization for the mapped data elements with inconsistent permissible values. Then we built an Alzheimer's Disease Data Element Ontology (ADEO) to model the mapped data elements in NACC and ADNI. We further developed a prototype cross-cohort query system to search patient cohorts across NACC and ADNI. RESULTS: After manual review, we found 172 mappings between NACC and ADNI. These 172 mappings were further used to construct common concepts in ADEO. Our data element mapping and harmonization resulted in five files storing common concepts, variables in NACC and ADNI, mappings between variables and common concepts, permissible values of categorical type data elements, and coding inconsistency harmonization, respectively. Our cross-cohort query system consists of three core architectural elements: a web-based interface, an advanced query engine, and a backend MongoDB database. CONCLUSIONS: In this work, ADEO has been specifically designed to facilitate data harmonization and cross-cohort query of NACC and ADNI data resources. Although our prototype cross-cohort query system was developed for exploring NACC and ADNI, its backend and frontend framework has been designed and implemented to be generally applicable to other domains for querying patient cohorts from multiple heterogeneous data sources.

Subjects

Alzheimer Disease , Humans , United States , Alzheimer Disease/diagnostic imaging , Neuroimaging

4.

Controlling amorphous silicon in scratching for fabricating high-performance micromixers.

Chen, Tingting; Cui, Licong; He, Wang; Liu, Renxing; Feng, Chengqiang; Wu, Lei; Wang, Yang; Liu, Huiyun; Qian, Linmao; Yu, Bingjun.

Lab Chip ; 23(17): 3794-3801, 2023 Aug 22.

Article in English | MEDLINE | ID: mdl-37498210

ABSTRACT

As core parts of microfluidic chip analysis systems, micromixers are widely applied across many fields. However, restricted by fabrication technology, it remains challenging to achieve high-quality micromixers with both delicately designed structures and efficient mixing. In this study, based on the theory of chaotic mixing, sinusoidal structures with variable phases were designed and then fabricated through scanning probe lithography (SPL) and post-selective etching. It was found that scratches with phase differences can lead to the periodic formation of amorphous silicon (a-Si), which can resist etching. Consequently, misaligned sine channels with thick-thin alternating 3D shapes can be generated in situ from the scratched traces after the etching. Further analysis showed that a thicker a-Si layer can be obtained by reducing the line spacing in the scratching, as confirmed by Raman measurements and simulations. With the proposed method, a misaligned sine micromixer was achieved with higher mixing efficiency than ever. The duplicating process was also investigated for high-precision production of micromixers. The study provides strategies for the miniaturization of high-performance microfluidic chips.

5.

Nanoscratch-induced formation of metallic micro/nanostructures with resin masks.

Xin, Mingyong; Feng, Qihui; Xu, Changbao; Cui, Licong; Zhu, Jie; Gan, Yinkai; Yu, Bingjun.

Discov Nano ; 18(1): 78, 2023 May 27.

Article in English | MEDLINE | ID: mdl-37382849

ABSTRACT

Metallic micro/nanostructures have a wide range of applications owing to their small size and superior performance. To obtain high-performance devices, it is of great importance to develop new methods for preparing metallic micro/nanostructures with high quality, low cost, and precise positioning. Metallic micro/nanostructures can be obtained by scratch-induced directional deposition of metals on a silicon surface, where the mask plays a key role in the process. This study focuses on the preparation of keto-aldehyde resin masks and their effects on the formation of scratch-induced gold (Au) micro/nanostructures. It is found that keto-aldehyde resin with a certain thickness can act as a satisfactory mask for high-quality Au deposition, and that scratches produced under a lower normal load and with fewer scratching cycles are more conducive to the formation of compact Au structures. With the proposed method, two-dimensional Au structures can be prepared on the designed scratching traces, providing a feasible path for fabricating high-quality metal-based sensors.

6.

A UMLS-based Investigation of Laterality in Biomedical Terminologies.

Abeysinghe, Rashmie; Hao, Xubing; Cui, Licong; Zhang, Guo-Qiang.

AMIA Jt Summits Transl Sci Proc ; 2023: 16-24, 2023.

Article in English | MEDLINE | ID: mdl-37350887

ABSTRACT

Laterality is an important anatomic directional property indicating the sidedness of body structures, diseases, and procedures. Errors in laterality could have catastrophic consequences in patient care. In this paper, we investigate how different biomedical terminologies organize terms indicating laterality. We leverage the Unified Medical Language System (UMLS) to identify lateral terms in different terminologies. For each lateral term, we attempt to obtain other matched lateral terms and further analyze how they are interrelated. Our results indicated that only 1.68% of the matched lateral term-pairs are hierarchically related, while 44.24% of matched pairs were siblings. We found that in SNOMED CT, unlike in most other terminologies, bilateral concepts were hierarchically related to both left and right lateral concepts. Further investigation revealed that the likely cause of these relations is how the logical definitions of SNOMED CT concepts are arranged.

7.

A Query Engine for Self-controlled Case Series, with an application to COVID-19 EHR data.

Li, Xiaojin; Huang, Yan; Cui, Licong; Zhang, Guo-Qiang.

AMIA Jt Summits Transl Sci Proc ; 2023: 350-359, 2023.

Article in English | MEDLINE | ID: mdl-37350916

ABSTRACT

Self-controlled case series (SCCS) is a statistical method in epidemiological study design that uses individuals as their own controls, with comparisons made within the same individuals at different time points of observation. SCCS has been applied in settings where it is difficult to identify comparison or control groups. To provide computational support for SCCS, we introduce a query engine called Self-Controlled Case Query (SCCQ) and use it to extract cohorts of self-controlled case series from a large-scale COVID-19 Electronic Health Records (EHR) dataset. SCCQ's query result dashboard, built with the R-Shiny visualization framework, gives the researcher a visual summary of the queried population. SCCQ allows the export of query-generated raw data files in a portable format that researchers can extend to create more intricate and robust visualization capabilities without needing a high level of technical or statistical background. Our validation and evaluation experiments found COVID-19 outcomes to be consistent with existing research findings. With SCCQ, cohort exploration, data extraction, and information visualization can be provided for structured EHR data to lower the barrier for clinical and epidemiological research.

8.

Extracting Temporal Expressions of First Seizure Onset from Epilepsy Patient Discharge Summaries.

Tao, Shiqiang; Abeysinghe, Rashmie; De La Esperanza, Blanca Talavera; Lhatoo, Samden; Zhang, Guo-Qiang; Cui, Licong.

AMIA Jt Summits Transl Sci Proc ; 2023: 515-524, 2023.

Article in English | MEDLINE | ID: mdl-37350927

ABSTRACT

Early onset of seizure is a potential risk factor for Sudden Unexpected Death in Epilepsy (SUDEP). However, the first seizure onset information is often documented as clinical narratives in epilepsy monitoring unit (EMU) discharge summaries. Manually extracting first seizure onset time from discharge summaries is time-consuming and labor-intensive. In this work, we developed a rule-based natural language processing pipeline for automatically extracting the temporal information of patients' first seizure onset from EMU discharge summaries. We use the Epilepsy and Seizure Ontology (EpSO) as the core knowledge resource and construct 4 extraction rules based on 300 randomly selected EMU discharge summaries. To evaluate the effectiveness of the extraction pipeline, we apply the constructed rules to another 200 unseen discharge summaries and compare the results against the manual evaluation of a domain expert. Overall, our extraction pipeline achieved a precision of 0.75, recall of 0.651, and F1-score of 0.697. This is an encouraging initial result which will allow us to gain insights into potentially better-performing approaches.
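
As a quick consistency check on the figures reported in this abstract, the F1-score is the harmonic mean of precision and recall. The sketch below uses the standard definition; the function name is ours and nothing here is taken from the authors' pipeline.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
print(round(f1_score(0.75, 0.651), 3))  # 0.697
```

The result agrees with the reported F1-score of 0.697.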

9.

Logical definition-based identification of potential missing concepts in SNOMED CT.

Hao, Xubing; Abeysinghe, Rashmie; Roberts, Kirk; Cui, Licong.

BMC Med Inform Decis Mak ; 23(Suppl 1): 87, 2023 05 09.

Article in English | MEDLINE | ID: mdl-37161566

ABSTRACT

BACKGROUND: Biomedical ontologies are representations of biomedical knowledge that provide terms with precisely defined meanings. They play a vital role in facilitating biomedical research in a cross-disciplinary manner. Quality issues of biomedical ontologies will hinder their effective usage. One such quality issue is missing concepts. In this study, we introduce a logical definition-based approach to identify potential missing concepts in SNOMED CT. A unique contribution of our approach is that it is capable of obtaining both logical definitions and fully specified names for potential missing concepts. METHOD: The logical definitions of unrelated pairs of fully defined concepts in non-lattice subgraphs that indicate quality issues are intersected to generate the logical definitions of potential missing concepts. A text summarization model (called PEGASUS) is fine-tuned to predict the fully specified names of the potential missing concepts from their generated logical definitions. Furthermore, the identified potential missing concepts are validated using external resources including the Unified Medical Language System (UMLS), biomedical literature in PubMed, and a newer version of SNOMED CT. RESULTS: From the March 2021 US Edition of SNOMED CT, we obtained a total of 30,313 unique logical definitions for potential missing concepts through the intersecting process. We fine-tuned a PEGASUS summarization model with 289,169 training instances and tested it on 36,146 instances. The model achieved 72.83 of ROUGE-1, 51.06 of ROUGE-2, and 71.76 of ROUGE-L on the test dataset. The model correctly predicted 11,549 out of 36,146 fully specified names in the test dataset. Applying the fine-tuned model to the 30,313 unique logical definitions, 23,031 total potential missing concepts were identified. Out of these, a total of 2,312 (10.04%) were automatically validated by at least one of the three resources. CONCLUSIONS: The results showed that our logical definition-based approach for identification of potential missing concepts in SNOMED CT is encouraging. Nevertheless, there is still room for improving the performance of naming concepts based on logical definitions.

Subjects

Biological Ontologies , Biomedical Research , Humans , Systematized Nomenclature of Medicine , Knowledge , Language

10.

Lateral Bending of Ag Nanowires toward Controllable Manipulation.

Cui, Licong; Li, Jiaming; Zhou, Huaicheng; Wu, Lei; Yang, Dan; Liu, Huiyun; Qian, Linmao; Yu, Bingjun.

ACS Nano ; 17(10): 9255-9261, 2023 May 23.

Article in English | MEDLINE | ID: mdl-37171168

ABSTRACT

Nanowires (NWs) provide opportunities for building high-performance sensors and devices at micro-/nanoscales. Directional movement and assembly of NWs have attracted extensive attention; however, controllable manipulation remains challenging, partly due to the lack of understanding of interfacial interactions between NWs and substrates (or contacting probes). In the present study, lateral bending of Ag NWs was investigated under various bending angles and pushing velocities, and the mechanical performance corresponding to microstructures was clarified based on high-resolution transmission electron microscope (HRTEM) observations. The bending-angle-dependent fractures of Ag NWs were detected by an atomic force microscope (AFM) and a scanning electron microscope (SEM), and the fractures occurred when the bending angle was larger than 80°. Compared with an Ag substrate, Ag NWs exhibited a lower system stiffness according to the nanoindentation with an AFM probe. HRTEM observations indicated that there were grain boundaries inside Ag NWs, which would contribute to the generation of fractures and cracks on Ag NWs during lateral bending and nanoindentation. This study provides a guide to controllably manipulate NWs and fabricate high-performance micro-/nanodevices.

11.

Population-Based Mini-Mental State Examination Norms in Adults of Mexican Heritage in the Cameron County Hispanic Cohort.

Bukhbinder, Avram S; Hinojosa, Miriam; Harris, Kristofer; Li, Xiaojin; Farrell, Christine M; Shyer, Madison; Goodwin, Nathan; Anjum, Sahar; Hasan, Omar; Cooper, Susan; Sciba, Lois; Vargas, Amanda Falk; Hunter, David H; Ortiz, Guadalupe J; Chung, Karen; Cui, Licong; Zhang, Guo-Qiang; Fisher-Hoch, Susan P; McCormick, Joseph B; Schulz, Paul E.

J Alzheimers Dis ; 92(4): 1323-1339, 2023.

Article in English | MEDLINE | ID: mdl-36872776

ABSTRACT

BACKGROUND: Accurately identifying cognitive changes in Mexican American (MA) adults using the Mini-Mental State Examination (MMSE) requires knowledge of population-based norms for the MMSE, a scale which has widespread use in research settings. OBJECTIVE: To describe the distribution of MMSE scores in a large cohort of MA adults, assess the impact of MMSE requirements on their clinical trial eligibility, and explore which factors are most strongly associated with their MMSE scores. METHODS: Visits between 2004 and 2021 in the Cameron County Hispanic Cohort were analyzed. Eligible participants were ≥18 years old and of Mexican descent. MMSE distributions before and after stratification by age and years of education (YOE) were assessed, as was the proportion of trial-aged (50-85-year-old) participants with MMSE <24, a minimum MMSE cutoff most frequently used in Alzheimer's disease (AD) clinical trials. As a secondary analysis, random forest models were constructed to estimate the relative association of the MMSE with potentially relevant variables. RESULTS: The sample (n = 3,404) had a mean age of 44.4 (SD, 16.0) years and was 64.5% female. Median MMSE was 28 (IQR, 28-29). The percentage of trial-aged participants (n = 1,267) with MMSE <24 was 18.6% overall and 54.3% among the subset with 0-4 YOE (n = 230). The five variables most associated with the MMSE in the study sample were education, age, exercise, C-reactive protein, and anxiety. CONCLUSION: The minimum MMSE cutoffs in most phase III prodromal-to-mild AD trials would exclude a significant proportion of trial-aged participants in this MA cohort, including over half of those with 0-4 YOE.

Subjects

Alzheimer Disease , Mental Status and Dementia Tests , Mexican Americans , Aged , Aged, 80 and over , Female , Humans , Male , Alzheimer Disease/diagnosis , Alzheimer Disease/psychology , Educational Status , Mexican Americans/psychology , Texas , Reference Values , Adult , Middle Aged

12.

A deep learning approach to identify missing is-a relations in SNOMED CT.

Abeysinghe, Rashmie; Zheng, Fengbo; Bernstam, Elmer V; Shi, Jay; Bodenreider, Olivier; Cui, Licong.

J Am Med Inform Assoc ; 30(3): 475-484, 2023 02 16.

Article in English | MEDLINE | ID: mdl-36539234

ABSTRACT

OBJECTIVE: SNOMED CT is the largest clinical terminology worldwide. Quality assurance of SNOMED CT is of utmost importance to ensure that it provides accurate domain knowledge to various SNOMED CT-based applications. In this work, we introduce a deep learning-based approach to uncover missing is-a relations in SNOMED CT. MATERIALS AND METHODS: Our focus is to identify missing is-a relations between concept-pairs exhibiting a containment pattern (ie, the set of words of one concept being a proper subset of that of the other concept). We use hierarchically related containment concept-pairs as positive instances and hierarchically unrelated containment concept-pairs as negative instances to train a model predicting whether an is-a relation exists between 2 concepts exhibiting a containment pattern. The model is a binary classifier leveraging concept name features, hierarchical features, enriched lexical attribute features, and logical definition features. We introduce a cross-validation inspired approach to identify missing is-a relations among all hierarchically unrelated containment concept-pairs. RESULTS: We trained and applied our model on the Clinical finding subhierarchy of SNOMED CT (September 2019 US edition). Our model (based on the validation sets) achieved a precision of 0.8164, recall of 0.8397, and F1 score of 0.8279. Applying the model to predict actual missing is-a relations, we obtained a total of 1661 potential candidates. Domain experts evaluated 230 randomly selected samples and verified that 192 (83.48%) are valid. CONCLUSIONS: The results showed that our deep learning approach is effective in uncovering missing is-a relations between containment concept-pairs in SNOMED CT.
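
The containment pattern defined in this abstract (the word set of one concept name being a proper subset of the other's) can be sketched as follows. This covers only the candidate-pair filter, not the deep learning classifier; the whitespace tokenization and the example concept names are assumptions made for illustration.

```python
def words(name: str) -> frozenset:
    """Return the set of lowercase words in a concept name (naive whitespace split)."""
    return frozenset(name.lower().split())


def has_containment(name_a: str, name_b: str) -> bool:
    """True when one name's word set is a proper subset of the other's."""
    wa, wb = words(name_a), words(name_b)
    return wa < wb or wb < wa  # '<' on sets tests for a proper subset


# Illustrative names only, not claimed to be actual SNOMED CT concepts.
print(has_containment("Fracture of femur", "Closed fracture of femur"))  # True
print(has_containment("Fracture of femur", "Fracture of tibia"))         # False
print(has_containment("Fracture of femur", "Fracture of femur"))         # False
```

Identical word sets are excluded because the subset must be proper, which matches the definition given in the abstract.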

Subjects

Deep Learning , Systematized Nomenclature of Medicine

13.

A GCN-based approach to uncover misaligned synonymous terms in the UMLS Metathesaurus.

Hao, Xubing; Abeysinghe, Rashmie; Shi, Jay; Cui, Licong.

AMIA Annu Symp Proc ; 2023: 977-986, 2023.

Article in English | MEDLINE | ID: mdl-38222357

ABSTRACT

The Unified Medical Language System (UMLS), a large repository of biomedical vocabularies, has been used for supporting various biomedical applications. Ensuring the quality of the UMLS is critical to maintain both the accuracy of its content and the reliability of downstream applications. In this work, we present a Graph Convolutional Network (GCN)-based approach to identify misaligned synonymous terms organized under different UMLS concepts. We used synonymous terms grouped under the same concept as positive samples and top lexically similar terms as negative samples to train the GCN model. We applied the model to a test set and suggested those negative samples predicted to be synonymous as potentially misaligned synonymous terms. A total of 147,625 suggestions were made. A human expert evaluated 100 randomly selected suggestions and agreed with 60 of them. The results indicate that our GCN-based approach shows promise to help improve the synonymy grouping in the UMLS.

Subjects

Unified Medical Language System , Humans , Reproducibility of Results

14.

Application of an ontology for model cards to generate computable artifacts for linking machine learning information from biomedical research.

Amith, Muhammad Tuan; Cui, Licong; Roberts, Kirk; Tao, Cui.

Proc Int World Wide Web Conf ; 2023(Companion): 820-825, 2023 Apr.

Article in English | MEDLINE | ID: mdl-38327770

ABSTRACT

Model card reports provide a transparent description of machine learning models, including information about their evaluation, limitations, intended use, etc. Federal health agencies have expressed an interest in model card reports for research studies using machine learning-based AI. Previously, we developed an ontology model for model card reports to structure and formalize these reports. In this paper, we demonstrate a Java-based library (OWL API, FaCT++) that leverages our ontology to publish computable model card reports. We discuss future directions and other use cases that highlight the applicability and feasibility of ontology-driven systems to support FAIR challenges.

15.

Knowledge Representation and Management: Notable Contributions in 2021.

Cui, Licong; Dhombres, Ferdinand; Charlet, Jean.

Yearb Med Inform ; 31(1): 236-240, 2022 Aug.

Article in English | MEDLINE | ID: mdl-36463882

ABSTRACT

OBJECTIVES: To select, present, and summarize the best papers in the field of Knowledge Representation and Management (KRM) published in 2021. METHODS: Following the International Medical Informatics Association (IMIA) Yearbook guidelines, a comprehensive and standardized review of the biomedical informatics literature was performed to select the best KRM papers published in 2021, based on PubMed queries. RESULTS: A total of 1,231 publications were retrieved from PubMed. We nominated 15 candidate best papers, and four of them were finally selected as the best papers in the KRM section. The topics covered by these papers include knowledge graph, ontology development, ontology alignment, and the International Classification of Diseases. CONCLUSION: In the KRM best paper selection for 2021, the candidate best papers covered a wider spectrum of topics compared with the previous year's significant focus on ontology curation. In particular, ontology development for specific domains (e.g., Alzheimer's disease, infectious diseases, bioethics) has received the most attention.

Subjects

Bioethics , Medical Informatics , International Classification of Diseases , Knowledge Management

16.

A multimodal clinical data resource for personalized risk assessment of sudden unexpected death in epilepsy.

Li, Xiaojin; Tao, Shiqiang; Lhatoo, Samden D; Cui, Licong; Huang, Yan; Hampson, Johnson P; Zhang, Guo-Qiang.

Front Big Data ; 5: 965715, 2022.

Article in English | MEDLINE | ID: mdl-36059922

ABSTRACT

Epilepsy affects ~2-3 million individuals in the United States, a third of whom have uncontrolled seizures. Sudden unexpected death in epilepsy (SUDEP) is a catastrophic and fatal complication of poorly controlled epilepsy and is the primary cause of mortality in such patients. Despite its huge public health impact, with a ~1/1,000 incidence rate in persons with epilepsy, it is an uncommon enough phenomenon to require multi-center efforts for well-powered studies. We developed the Multimodal SUDEP Data Resource (MSDR), a comprehensive system for sharing multimodal epilepsy data in the NIH-funded Center for SUDEP Research. The MSDR aims to accelerate research to address critical questions about personalized risk assessment of SUDEP. We used a metadata-guided approach, with a set of common epilepsy-specific terms enforcing uniform semantic interpretation of data elements across three main components: (1) multi-site annotated datasets; (2) user interfaces for capturing, managing, and accessing data; and (3) computational approaches for the analysis of multimodal clinical data. We incorporated the process for managing dataset-specific data use agreements, evidence of Institutional Review Board review, and the corresponding access control in the MSDR web portal. The metadata-guided approach facilitates structural and semantic interoperability, ultimately leading to enhanced data reusability and scientific rigor. MSDR prospectively integrated and curated epilepsy patient data from seven institutions, and it currently contains data on 2,739 subjects and 10,685 multimodal clinical data files with different data formats. In total, 55 users have registered in the current MSDR data repository, and 6 projects have been funded to apply MSDR in epilepsy research, including three R01 projects and three R21 projects.

17.

Identification of missing hierarchical relations in the vaccine ontology using acquired term pairs.

Manuel, Warren; Abeysinghe, Rashmie; He, Yongqun; Tao, Cui; Cui, Licong.

J Biomed Semantics ; 13(1): 22, 2022 08 13.

Article in English | MEDLINE | ID: mdl-35964149

ABSTRACT

BACKGROUND: The Vaccine Ontology (VO) is a biomedical ontology that standardizes vaccine annotation. Errors in VO will affect the many applications that use it. Quality assurance of VO is imperative to ensure that it provides accurate domain knowledge to these downstream tasks. Manual review to identify and fix quality issues (such as missing hierarchical is-a relations) is challenging given the complexity of the ontology. Automated approaches are highly desirable to facilitate the quality assurance of VO. METHODS: We developed an automated lexical approach that identifies potentially missing is-a relations in VO. First, we construct two types of VO concept-pairs: (1) linked; and (2) unlinked. From each concept-pair, an Acquired Term Pair (ATP) is then derived based on the lexical features of its two concepts. If the same ATP is obtained by a linked concept-pair and an unlinked concept-pair, this is considered to indicate a potentially missing is-a relation between the unlinked pair of concepts. RESULTS: Applying this approach to the 1.1.192 version of VO, we were able to identify 232 potentially missing is-a relations. A manual review by a VO domain expert on a random sample of 70 potentially missing is-a relations revealed that 65 of the cases were valid missing is-a relations in VO (a precision of 92.86%). CONCLUSIONS: The results indicate that our approach is highly effective in identifying missing is-a relations in VO.

Subjects

Biological Ontologies , Vaccines , Adenosine Triphosphate

18.

Towards quality improvement of vaccine concept mappings in the OMOP vocabulary with a semi-automated method.

Abeysinghe, Rashmie; Black, Adam; Kaduk, Denys; Li, Yupeng; Reich, Christian; Davydov, Alexander; Yao, Lixia; Cui, Licong.

J Biomed Inform ; 134: 104162, 2022 10.

Article in English | MEDLINE | ID: mdl-36029954

ABSTRACT

The Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) provides a unified model to integrate disparate real-world data (RWD) sources. An integral part of the OMOP CDM is the Standardized Vocabularies (henceforth referred to as the OMOP vocabulary), which enable organization and standardization of medical concepts across the clinical domains of the OMOP CDM. For concepts with the same meaning from different source vocabularies, one is designated as the standard concept, while the others are specified as non-standard or source concepts and mapped to the standard one. However, due to the heterogeneity of source vocabularies, the OMOP vocabulary may contain mapping issues such as erroneous mappings and missing mappings, which could affect the results of downstream analyses with RWD. In this paper, we focus on quality assurance of vaccine concept mappings in the OMOP vocabulary, which is necessary to accurately harness the power of RWD on vaccines. We introduce a semi-automated lexical approach to audit vaccine mappings in the OMOP vocabulary. We generated two types of vaccine-pairs: mapped and unmapped, where mapped vaccine-pairs are pairs of vaccine concepts with a "Maps to" relationship, while unmapped vaccine-pairs are those without a "Maps to" relationship. We represented each vaccine concept name as a set of words, and derived term-difference pairs (i.e., name differences) for mapped and unmapped vaccine-pairs. If the same term-difference pair is obtained from both a mapped and an unmapped vaccine-pair, it is considered a potential mapping inconsistency. Applying this approach to the vaccine mappings in OMOP yielded a total of 2087 potential mapping inconsistencies. A random sample of 200 of these was evaluated by domain experts to identify, validate, and categorize the inconsistencies. Experts identified 95 cases revealing valid mapping issues. The remaining 105 cases were found to be invalid because the mappings relied on external and/or contextual information that was not reflected in the concept names of the vaccines. This indicates that our semi-automated approach shows promise in identifying mapping inconsistencies among vaccine concepts in the OMOP vocabulary.
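The lexical audit described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the lowercase whitespace tokenization, the order-independent form of the term-difference pair, the exhaustive enumeration of unmapped pairs, and every concept ID and name in the toy data are assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def word_set(name):
    # Assumption: lowercase, whitespace-separated tokens stand in for "a set of words".
    return frozenset(name.lower().split())


def term_difference(name_a, name_b):
    # Words unique to each name, stored in an order-independent form so that
    # (A, B) and (B, A) yield the same term-difference pair.
    a, b = word_set(name_a), word_set(name_b)
    left = tuple(sorted(a - b))
    right = tuple(sorted(b - a))
    return tuple(sorted((left, right)))


def find_inconsistencies(concepts, maps_to):
    """Flag term-difference pairs produced by both mapped and unmapped pairs.

    concepts: dict of concept ID -> concept name (hypothetical structure)
    maps_to:  set of (source_id, standard_id) tuples with a "Maps to" relationship
    """
    mapped = {frozenset(pair) for pair in maps_to}
    by_diff = defaultdict(lambda: {"mapped": [], "unmapped": []})
    for id_a, id_b in combinations(sorted(concepts), 2):
        diff = term_difference(concepts[id_a], concepts[id_b])
        kind = "mapped" if frozenset((id_a, id_b)) in mapped else "unmapped"
        by_diff[diff][kind].append((id_a, id_b))
    # A difference seen in both groups is a potential mapping inconsistency.
    return {d: g for d, g in by_diff.items() if g["mapped"] and g["unmapped"]}


# Toy data: IDs and names are invented, not taken from the OMOP vocabulary.
concepts = {
    1: "hepatitis B vaccine",
    2: "hepatitis B vaccine injection",
    3: "influenza vaccine",
    4: "influenza vaccine injection",
}
maps_to = {(2, 1)}

flagged = find_inconsistencies(concepts, maps_to)
for diff, groups in flagged.items():
    print(diff, groups)
```

In this toy data, concepts 1 and 2 are mapped and differ only by the word "injection", while concepts 3 and 4 differ by the same word but are not mapped, so that difference is flagged for review. As in the abstract, a flagged difference is only a candidate: expert review decides whether it reflects a real mapping issue.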

Subjects

Vaccines, Vocabulary, Quality Improvement, Controlled Vocabulary

19.

DaT3M: A Data Tracker for Multi-faceted Management of Multi-site Clinical Research Data Submission, Curation, Master Inventorying, and Sharing.

Tao, Shiqiang; Cui, Licong; Chou, Wei-Chun; Lhatoo, Samden; Zhang, Guo-Qiang.

AMIA Jt Summits Transl Sci Proc ; 2022: 466-475, 2022.

Article in English | MEDLINE | ID: mdl-35854726

ABSTRACT

Managing research data is an important and challenging aspect of clinical studies, especially for multi-site collaboratives. To address this challenge, we designed, developed, and deployed a multi-faceted, multi-level interactive data tracker (DaT3M) for multi-site clinical research data submission, curation, master inventorying, and sharing. Components of DaT3M include a data overview, data portal, data status panel, data query engine, and data downloader. DaT3M has been used to manage clinical research data for the Center for SUDEP Research (CSR). The CSR instance of DaT3M includes 2,743 subjects from seven data-contributing institutions, seven data modalities, and 10,678 data components: 3,398 Epilepsy Monitoring Unit reports, 3,440 electroencephalography recordings, 629 MRI imaging datasets, 177 bio-chemistry datasets, 722 DNA datasets, 2,289 follow-up forms, and 30 SUDEP forms. Preliminary, structured, one-on-one usability evaluations were performed with seven researchers from four institutions. The System Usability Score reached 85.3, indicating a high level of user satisfaction with DaT3M in our pilot evaluation.

20.

Toward a standard formal semantic representation of the model card report.

Amith, Muhammad Tuan; Cui, Licong; Zhi, Degui; Roberts, Kirk; Jiang, Xiaoqian; Li, Fang; Yu, Evan; Tao, Cui.

BMC Bioinformatics ; 23(Suppl 6): 281, 2022 Jul 14.

Article in English | MEDLINE | ID: mdl-35836130

ABSTRACT

BACKGROUND: Model card reports aim to provide informative and transparent descriptions of machine learning models to stakeholders. This type of report is of interest to the National Institutes of Health's Bridge2AI initiative as a way to address the FAIR challenges of artificial intelligence-based machine learning models for biomedical research. We present our early undertaking in developing an ontology for capturing the conceptual-level information embedded in model card reports. RESULTS: Sourcing from existing ontologies and developing the core framework, we generated the Model Card Report Ontology. Our development efforts yielded an OWL2-based artifact that represents and formalizes model card report information. The current release of this ontology utilizes standard concepts and properties from OBO Foundry ontologies. In addition, the software reasoner indicated no logical inconsistencies in the ontology. With sample model cards of machine learning models for bioinformatics research (HIV social networks and adverse outcome prediction for stent implantation), we showed the coverage and usefulness of our model in transforming static model card reports into a computable format for machine-based processing. CONCLUSIONS: The benefit of our work is that it utilizes the expansive, standard terminologies and the scientific rigor promoted by biomedical ontologists, and that it provides an avenue for making model cards machine-readable using semantic web technology. Our future goal is to assess the veracity of our model and later expand it to include additional concepts that address terminological gaps. We discuss tools and software that will utilize our ontology for potential application services.

Subjects

Biological Ontologies, Semantics, Artificial Intelligence, Computational Biology, Machine Learning, Software
